Altitude Training: Strong Bounds for Single-Layer Dropout
Authors
Abstract
Dropout training, originally designed for deep neural networks, has been successful on high-dimensional single-layer natural language tasks. This paper proposes a theoretical explanation for this phenomenon: we show that, under a generative Poisson topic model with long documents, dropout training improves the exponent in the generalization bound for empirical risk minimization. Dropout achieves this gain much like a marathon runner who practices at altitude: once a classifier learns to perform reasonably well on training examples that have been artificially corrupted by dropout, it will do very well on the uncorrupted test set. We also show that, under similar conditions, dropout preserves the Bayes decision boundary and should therefore induce minimal bias in high dimensions.
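The "altitude training" mechanism the abstract describes is empirical risk minimization over artificially corrupted copies of the training data: each feature is zeroed out independently with some dropout probability during training, and the learned weights are then applied to uncorrupted test documents. As a rough illustration, here is a minimal sketch of this scheme for a single-layer logistic classifier on bag-of-words counts; the function name, learning rate, toy Poisson data, and the (1 - p) test-time rescaling convention are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_sgd(X, y, p=0.5, lr=0.1, epochs=20):
    """Train a single-layer logistic classifier on dropout-corrupted
    training examples: each feature is zeroed i.i.d. with probability p.
    The hyperparameters here are illustrative, not from the paper."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            mask = rng.random(d) >= p   # keep each feature with prob. 1 - p
            x = X[i] * mask             # artificially corrupted example
            z = x @ w
            grad = (1.0 / (1.0 + np.exp(-z)) - y[i]) * x  # logistic-loss gradient
            w -= lr * grad
    # Rescale so test-time activations match the mean of the corrupted
    # training inputs (one common dropout convention; an assumption here).
    return (1 - p) * w

# Toy long-document data in the spirit of the Poisson topic model:
# 200 documents over 1000 "word" features with Poisson counts.
X = rng.poisson(0.5, size=(200, 1000)).astype(float)
y = (X[:, :50].sum(axis=1) > X[:, 50:100].sum(axis=1)).astype(float)
w = dropout_sgd(X, y)
```

In this sketch the classifier never sees an uncorrupted document during training, which is the point of the marathon-runner analogy: the test distribution is strictly easier than the corrupted training distribution.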
Similar resources
Knowledge, Attitude and Practices Regarding Extreme Environments and Cold Adaptation at Extreme Altitudes on the Himalayan Ranges
Introduction: Extreme altitudes (5500 m / 18045 ft and higher) pose environmental, psychophysiological, infrastructural, logistic, and ergonomic challenges that test an explorer's adaptability and mission efficiency due to isolation, monotony, an intimidating environment, and harsh health conditions. The assessment of an explorer's comprehensive adaptability at extreme altitudes is ...
A PAC-Bayesian Tutorial with A Dropout Bound
This tutorial gives a concise overview of existing PAC-Bayesian theory focusing on three generalization bounds. The first is an Occam bound which handles rules with finite precision parameters and which states that generalization loss is near training loss when the number of bits needed to write the rule is small compared to the sample size. The second is a PAC-Bayesian bound providing a genera...
Swapout: Learning an ensemble of deep architectures
We describe Swapout, a new stochastic training method that outperforms ResNets of identical network structure, yielding impressive results on CIFAR-10 and CIFAR-100. Swapout samples from a rich set of architectures, including dropout [17], stochastic depth [6], and residual architectures [4, 5] as special cases. When viewed as a regularization method, swapout not only inhibits co-adaptation of unit...
Understanding Dropout
Dropout is a relatively new algorithm for training neural networks which relies on stochastically “dropping out” neurons during training in order to avoid the co-adaptation of feature detectors. We introduce a general formalism for studying dropout on either units or connections, with arbitrary probability values, and use it to analyze the averaging and regularizing properties of dropout in bot...
On Binary Classification with Single-Layer Convolutional Neural Networks
Convolutional neural networks are becoming standard tools for solving object recognition and visual tasks. However, most of the design and implementation of these complex models is based on trial-and-error. In this report, the main focus is to consider some of the important factors in designing convolutional networks to perform better. Specifically, classification with wide single-layer networ...